Toronto Crime Predictions
Table of Contents
- Introduction
- Retrieve Data from API
- Preprocess the Data
- Create the Model
- Creating the Testing and Training Datasets - First Approach
- Testing Different Models - First Approach
- Optimizing the models - First Approach
- Creating the Testing and Training Datasets - Second Approach
- Testing Different Models - Second Approach
- Optimizing the models - Second Approach
- Auto Theft Models
- Total Count Models
- Apply Regression Chain Boosting Algorithm on RF Regressor
- Perform Hyper-Parameter Tuning on RF Regression Chain
- Apply ADA Boosting Algorithm on HBGB
- Perform Hyper-Parameter Tuning on ADA Boosted HBGB Regressor
- Perform Hyper-Parameter Tuning on RF Regressor
- Perform Hyper-Parameter Tuning on HBGB Regressor
- Create a Voting Ensemble Learning Model with the default RF and HBGB Models
- Results
- Visualizations based on current data
- Visualizations based on the Predictions
- Anticipated Crime Statistics for next six months of 2024
- Anticipated Total count of Crime Activities for upcoming three years
- Anticipated Crime Statistics for Upcoming Years with a Month Breakdown
- Anticipated Crime Statistics for 2025
- Anticipated Crime Statistics for 2026
- Anticipated Crime Statistics for 2027
- Summary and Conclusion
Introduction
Project Overview
| | geometry | _id | AREA_ID | AREA_ATTR_ID | PARENT_AREA_ID | AREA_SHORT_CODE | AREA_LONG_CODE | AREA_NAME | AREA_DESC | CLASSIFICATION | CLASSIFICATION_CODE | OBJECTID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | MULTIPOLYGON (((-79.38635 43.69783, -79.38623 ... | 1 | 2502366 | 26022881 | 0 | 174 | 174 | South Eglinton-Davisville | South Eglinton-Davisville (174) | Not an NIA or Emerging Neighbourhood | NA | 17824737.0 |
| 1 | MULTIPOLYGON (((-79.39744 43.70693, -79.39837 ... | 2 | 2502365 | 26022880 | 0 | 173 | 173 | North Toronto | North Toronto (173) | Not an NIA or Emerging Neighbourhood | NA | 17824753.0 |
| 2 | MULTIPOLYGON (((-79.43411 43.66015, -79.43537 ... | 3 | 2502364 | 26022879 | 0 | 172 | 172 | Dovercourt Village | Dovercourt Village (172) | Not an NIA or Emerging Neighbourhood | NA | 17824769.0 |
| 3 | MULTIPOLYGON (((-79.4387 43.66766, -79.43841 4... | 4 | 2502363 | 26022878 | 0 | 171 | 171 | Junction-Wallace Emerson | Junction-Wallace Emerson (171) | Not an NIA or Emerging Neighbourhood | NA | 17824785.0 |
| 4 | MULTIPOLYGON (((-79.38404 43.64497, -79.38502 ... | 5 | 2502362 | 26022877 | 0 | 170 | 170 | Yonge-Bay Corridor | Yonge-Bay Corridor (170) | Not an NIA or Emerging Neighbourhood | NA | 17824801.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 153 | MULTIPOLYGON (((-79.59037 43.73401, -79.58942 ... | 154 | 2502213 | 26022728 | 0 | 001 | 001 | West Humber-Clairville | West Humber-Clairville (1) | Not an NIA or Emerging Neighbourhood | NA | 17827185.0 |
| 154 | MULTIPOLYGON (((-79.51915 43.77399, -79.51901 ... | 155 | 2502212 | 26022727 | 0 | 024 | 024 | Black Creek | Black Creek (24) | Neighbourhood Improvement Area | NIA | 17827201.0 |
| 155 | MULTIPOLYGON (((-79.53225 43.73505, -79.52938 ... | 156 | 2502211 | 26022726 | 0 | 023 | 023 | Pelmo Park-Humberlea | Pelmo Park-Humberlea (23) | Not an NIA or Emerging Neighbourhood | NA | 17827217.0 |
| 156 | MULTIPOLYGON (((-79.52813 43.74425, -79.52721 ... | 157 | 2502210 | 26022725 | 0 | 022 | 022 | Humbermede | Humbermede (22) | Neighbourhood Improvement Area | NIA | 17827233.0 |
| 157 | MULTIPOLYGON (((-79.53396 43.76886, -79.53227 ... | 158 | 2502209 | 26022724 | 0 | 021 | 021 | Humber Summit | Humber Summit (21) | Neighbourhood Improvement Area | NIA | 17827249.0 |
158 rows × 12 columns
| | geometry | NEIGHBOURHOOD_158 |
|---|---|---|
| 0 | MULTIPOLYGON (((-79.38635 43.69783, -79.38623 ... | South Eglinton-Davisville (174) |
| 1 | MULTIPOLYGON (((-79.39744 43.70693, -79.39837 ... | North Toronto (173) |
| 2 | MULTIPOLYGON (((-79.43411 43.66015, -79.43537 ... | Dovercourt Village (172) |
| 3 | MULTIPOLYGON (((-79.4387 43.66766, -79.43841 4... | Junction-Wallace Emerson (171) |
| 4 | MULTIPOLYGON (((-79.38404 43.64497, -79.38502 ... | Yonge-Bay Corridor (170) |
| ... | ... | ... |
| 153 | MULTIPOLYGON (((-79.59037 43.73401, -79.58942 ... | West Humber-Clairville (1) |
| 154 | MULTIPOLYGON (((-79.51915 43.77399, -79.51901 ... | Black Creek (24) |
| 155 | MULTIPOLYGON (((-79.53225 43.73505, -79.52938 ... | Pelmo Park-Humberlea (23) |
| 156 | MULTIPOLYGON (((-79.52813 43.74425, -79.52721 ... | Humbermede (22) |
| 157 | MULTIPOLYGON (((-79.53396 43.76886, -79.53227 ... | Humber Summit (21) |
158 rows × 2 columns
| | geometry | NEIGHBOURHOOD_158 | HOOD_158 |
|---|---|---|---|
| 0 | MULTIPOLYGON (((-79.38635 43.69783, -79.38623 ... | South Eglinton-Davisville (174) | 174 |
| 1 | MULTIPOLYGON (((-79.39744 43.70693, -79.39837 ... | North Toronto (173) | 173 |
| 2 | MULTIPOLYGON (((-79.43411 43.66015, -79.43537 ... | Dovercourt Village (172) | 172 |
| 3 | MULTIPOLYGON (((-79.4387 43.66766, -79.43841 4... | Junction-Wallace Emerson (171) | 171 |
| 4 | MULTIPOLYGON (((-79.38404 43.64497, -79.38502 ... | Yonge-Bay Corridor (170) | 170 |
| ... | ... | ... | ... |
| 153 | MULTIPOLYGON (((-79.59037 43.73401, -79.58942 ... | West Humber-Clairville (1) | 1 |
| 154 | MULTIPOLYGON (((-79.51915 43.77399, -79.51901 ... | Black Creek (24) | 24 |
| 155 | MULTIPOLYGON (((-79.53225 43.73505, -79.52938 ... | Pelmo Park-Humberlea (23) | 23 |
| 156 | MULTIPOLYGON (((-79.52813 43.74425, -79.52721 ... | Humbermede (22) | 22 |
| 157 | MULTIPOLYGON (((-79.53396 43.76886, -79.53227 ... | Humber Summit (21) | 21 |
158 rows × 3 columns
Retrieve Data from API
[{'type': 'Feature',
'id': 246675,
'geometry': {'type': 'Point',
'coordinates': [-79.425761926, 43.6817690130001]},
'properties': {'OBJECTID': 246675,
'EVENT_UNIQUE_ID': 'GO-20213605',
'REPORT_DATE': 1609477200000,
'OCC_DATE': 1609477200000,
'REPORT_YEAR': 2021,
'REPORT_MONTH': 'January',
'REPORT_DAY': 1,
'REPORT_DOY': 1,
'REPORT_DOW': 'Friday ',
'REPORT_HOUR': 16,
'OCC_YEAR': 2021,
'OCC_MONTH': 'January',
'OCC_DAY': 1,
'OCC_DOY': 1,
'OCC_DOW': 'Friday ',
'OCC_HOUR': 16,
'DIVISION': 'D13',
'LOCATION_TYPE': 'Parking Lots (Apt., Commercial Or Non-Commercial)',
'PREMISES_TYPE': 'Outside',
'UCR_CODE': 2135,
'UCR_EXT': 210,
'OFFENCE': 'Theft Of Motor Vehicle',
'MCI_CATEGORY': 'Auto Theft',
'HOOD_158': '094',
'NEIGHBOURHOOD_158': 'Wychwood (94)',
'HOOD_140': '094',
'NEIGHBOURHOOD_140': 'Wychwood (94)',
'LONG_WGS84': -79.42576192637651,
'LAT_WGS84': 43.68176901263976}},
{'type': 'Feature',
'id': 246676,
'geometry': {'type': 'Point',
'coordinates': [5.6843418860808e-14, 5.08888749034163e-14]},
'properties': {'OBJECTID': 246676,
'EVENT_UNIQUE_ID': 'GO-20213400',
'REPORT_DATE': 1609477200000,
'OCC_DATE': 1609477200000,
'REPORT_YEAR': 2021,
'REPORT_MONTH': 'January',
'REPORT_DAY': 1,
'REPORT_DOY': 1,
'REPORT_DOW': 'Friday ',
'REPORT_HOUR': 16,
'OCC_YEAR': 2021,
'OCC_MONTH': 'January',
'OCC_DAY': 1,
'OCC_DOY': 1,
'OCC_DOW': 'Friday ',
'OCC_HOUR': 4,
'DIVISION': 'D33',
'LOCATION_TYPE': 'Other Commercial / Corporate Places (For Profit, Warehouse, Corp. Bldg',
'PREMISES_TYPE': 'Commercial',
'UCR_CODE': 2135,
'UCR_EXT': 210,
'OFFENCE': 'Theft Of Motor Vehicle',
'MCI_CATEGORY': 'Auto Theft',
'HOOD_158': 'NSA',
'NEIGHBOURHOOD_158': 'NSA',
'HOOD_140': 'NSA',
'NEIGHBOURHOOD_140': 'NSA',
'LONG_WGS84': 0,
'LAT_WGS84': 0}},
{'type': 'Feature',
'id': 246677,
'geometry': {'type': 'Point', 'coordinates': [-79.460110312, 43.721012854]},
'properties': {'OBJECTID': 246677,
'EVENT_UNIQUE_ID': 'GO-20211123',
'REPORT_DATE': 1609477200000,
'OCC_DATE': 1609477200000,
'REPORT_YEAR': 2021,
'REPORT_MONTH': 'January',
'REPORT_DAY': 1,
'REPORT_DOY': 1,
'REPORT_DOW': 'Friday ',
'REPORT_HOUR': 7,
'OCC_YEAR': 2021,
'OCC_MONTH': 'January',
'OCC_DAY': 1,
'OCC_DOY': 1,
'OCC_DOW': 'Friday ',
'OCC_HOUR': 4,
'DIVISION': 'D32',
'LOCATION_TYPE': "Other Non Commercial / Corporate Places (Non-Profit, Gov'T, Firehall)",
'PREMISES_TYPE': 'Other',
'UCR_CODE': 2135,
'UCR_EXT': 210,
'OFFENCE': 'Theft Of Motor Vehicle',
'MCI_CATEGORY': 'Auto Theft',
'HOOD_158': '031',
'NEIGHBOURHOOD_158': 'Yorkdale-Glen Park (31)',
'HOOD_140': '031',
'NEIGHBOURHOOD_140': 'Yorkdale-Glen Park (31)',
'LONG_WGS84': -79.46011031171706,
'LAT_WGS84': 43.72101285418029}}]
[{'OBJECTID': 246675,
'EVENT_UNIQUE_ID': 'GO-20213605',
'REPORT_DATE': 1609477200000,
'OCC_DATE': 1609477200000,
'REPORT_YEAR': 2021,
'REPORT_MONTH': 'January',
'REPORT_DAY': 1,
'REPORT_DOY': 1,
'REPORT_DOW': 'Friday ',
'REPORT_HOUR': 16,
'OCC_YEAR': 2021,
'OCC_MONTH': 'January',
'OCC_DAY': 1,
'OCC_DOY': 1,
'OCC_DOW': 'Friday ',
'OCC_HOUR': 16,
'DIVISION': 'D13',
'LOCATION_TYPE': 'Parking Lots (Apt., Commercial Or Non-Commercial)',
'PREMISES_TYPE': 'Outside',
'UCR_CODE': 2135,
'UCR_EXT': 210,
'OFFENCE': 'Theft Of Motor Vehicle',
'MCI_CATEGORY': 'Auto Theft',
'HOOD_158': '094',
'NEIGHBOURHOOD_158': 'Wychwood (94)',
'HOOD_140': '094',
'NEIGHBOURHOOD_140': 'Wychwood (94)',
'LONG_WGS84': -79.42576192637651,
'LAT_WGS84': 43.68176901263976},
{'OBJECTID': 246676,
'EVENT_UNIQUE_ID': 'GO-20213400',
'REPORT_DATE': 1609477200000,
'OCC_DATE': 1609477200000,
'REPORT_YEAR': 2021,
'REPORT_MONTH': 'January',
'REPORT_DAY': 1,
'REPORT_DOY': 1,
'REPORT_DOW': 'Friday ',
'REPORT_HOUR': 16,
'OCC_YEAR': 2021,
'OCC_MONTH': 'January',
'OCC_DAY': 1,
'OCC_DOY': 1,
'OCC_DOW': 'Friday ',
'OCC_HOUR': 4,
'DIVISION': 'D33',
'LOCATION_TYPE': 'Other Commercial / Corporate Places (For Profit, Warehouse, Corp. Bldg',
'PREMISES_TYPE': 'Commercial',
'UCR_CODE': 2135,
'UCR_EXT': 210,
'OFFENCE': 'Theft Of Motor Vehicle',
'MCI_CATEGORY': 'Auto Theft',
'HOOD_158': 'NSA',
'NEIGHBOURHOOD_158': 'NSA',
'HOOD_140': 'NSA',
'NEIGHBOURHOOD_140': 'NSA',
'LONG_WGS84': 0,
'LAT_WGS84': 0},
{'OBJECTID': 246677,
'EVENT_UNIQUE_ID': 'GO-20211123',
'REPORT_DATE': 1609477200000,
'OCC_DATE': 1609477200000,
'REPORT_YEAR': 2021,
'REPORT_MONTH': 'January',
'REPORT_DAY': 1,
'REPORT_DOY': 1,
'REPORT_DOW': 'Friday ',
'REPORT_HOUR': 7,
'OCC_YEAR': 2021,
'OCC_MONTH': 'January',
'OCC_DAY': 1,
'OCC_DOY': 1,
'OCC_DOW': 'Friday ',
'OCC_HOUR': 4,
'DIVISION': 'D32',
'LOCATION_TYPE': "Other Non Commercial / Corporate Places (Non-Profit, Gov'T, Firehall)",
'PREMISES_TYPE': 'Other',
'UCR_CODE': 2135,
'UCR_EXT': 210,
'OFFENCE': 'Theft Of Motor Vehicle',
'MCI_CATEGORY': 'Auto Theft',
'HOOD_158': '031',
'NEIGHBOURHOOD_158': 'Yorkdale-Glen Park (31)',
'HOOD_140': '031',
'NEIGHBOURHOOD_140': 'Yorkdale-Glen Park (31)',
'LONG_WGS84': -79.46011031171706,
'LAT_WGS84': 43.72101285418029}]
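The second listing above is the first one with the GeoJSON wrapper removed: only each feature's `properties` dictionary is kept. The notebook's own cell is not shown, so the following is a minimal sketch in plain Python under that assumption. The `features` list is an abbreviated copy of the first two records, and `epoch_ms_to_utc` is a hypothetical helper added here to show that `OCC_DATE` and `REPORT_DATE` are epoch timestamps in milliseconds.

```python
from datetime import datetime, timezone

# Abbreviated copies of the first two features shown above
# (most properties are omitted to keep the sketch short).
features = [
    {"type": "Feature", "id": 246675,
     "geometry": {"type": "Point",
                  "coordinates": [-79.425761926, 43.6817690130001]},
     "properties": {"OBJECTID": 246675, "EVENT_UNIQUE_ID": "GO-20213605",
                    "OCC_DATE": 1609477200000, "MCI_CATEGORY": "Auto Theft",
                    "HOOD_158": "094"}},
    {"type": "Feature", "id": 246676,
     "geometry": {"type": "Point",
                  "coordinates": [5.6843418860808e-14, 5.08888749034163e-14]},
     "properties": {"OBJECTID": 246676, "EVENT_UNIQUE_ID": "GO-20213400",
                    "OCC_DATE": 1609477200000, "MCI_CATEGORY": "Auto Theft",
                    "HOOD_158": "NSA"}},
]

# Keep only the attribute dictionary of each feature; the geometry is
# already duplicated in LONG_WGS84 / LAT_WGS84 in the full records.
records = [feature["properties"] for feature in features]


def epoch_ms_to_utc(milliseconds):
    """Hypothetical helper: epoch milliseconds -> timezone-aware UTC datetime."""
    return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc)


print(records[0]["EVENT_UNIQUE_ID"])                        # GO-20213605
print(epoch_ms_to_utc(records[0]["OCC_DATE"]).isoformat())  # 2021-01-01T05:00:00+00:00
```

05:00 UTC is midnight Eastern Standard Time, so the timestamp appears to encode only the calendar date; the hour is carried separately in `OCC_HOUR`.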
| | OBJECTID | EVENT_UNIQUE_ID | REPORT_DATE | OCC_DATE | REPORT_YEAR | REPORT_MONTH | REPORT_DAY | REPORT_DOY | REPORT_DOW | REPORT_HOUR | ... | UCR_CODE | UCR_EXT | OFFENCE | MCI_CATEGORY | HOOD_158 | NEIGHBOURHOOD_158 | HOOD_140 | NEIGHBOURHOOD_140 | LONG_WGS84 | LAT_WGS84 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 246675 | GO-20213605 | 1609477200000 | 1609477200000 | 2021 | January | 1 | 1 | Friday | 16 | ... | 2135 | 210 | Theft Of Motor Vehicle | Auto Theft | 094 | Wychwood (94) | 094 | Wychwood (94) | -79.425762 | 43.681769 |
| 1 | 246676 | GO-20213400 | 1609477200000 | 1609477200000 | 2021 | January | 1 | 1 | Friday | 16 | ... | 2135 | 210 | Theft Of Motor Vehicle | Auto Theft | NSA | NSA | NSA | NSA | 0.000000 | 0.000000 |
| 2 | 246677 | GO-20211123 | 1609477200000 | 1609477200000 | 2021 | January | 1 | 1 | Friday | 7 | ... | 2135 | 210 | Theft Of Motor Vehicle | Auto Theft | 031 | Yorkdale-Glen Park (31) | 031 | Yorkdale-Glen Park (31) | -79.460110 | 43.721013 |
| 3 | 246678 | GO-2021445 | 1609477200000 | 1609477200000 | 2021 | January | 1 | 1 | Friday | 1 | ... | 2135 | 210 | Theft Of Motor Vehicle | Auto Theft | 151 | Yonge-Doris (151) | 051 | Willowdale East (51) | -79.415293 | 43.778743 |
| 4 | 246679 | GO-20213400 | 1609477200000 | 1609477200000 | 2021 | January | 1 | 1 | Friday | 16 | ... | 2135 | 210 | Theft Of Motor Vehicle | Auto Theft | NSA | NSA | NSA | NSA | 0.000000 | 0.000000 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 147546 | 396731 | GO-20241427047 | 1719723600000 | 1719637200000 | 2024 | June | 30 | 182 | Sunday | 16 | ... | 1430 | 100 | Assault | Assault | 071 | Cabbagetown-South St.James Town (71) | 071 | Cabbagetown-South St.James Town (71) | -79.373043 | 43.663195 |
| 147547 | 396732 | GO-20241427869 | 1719723600000 | 1719723600000 | 2024 | June | 30 | 182 | Sunday | 18 | ... | 2133 | 200 | Theft Over - Shoplifting | Theft Over | 027 | York University Heights (27) | 027 | York University Heights (27) | -79.464942 | 43.759469 |
| 147548 | 396733 | GO-20241423116 | 1719723600000 | 1719637200000 | 2024 | June | 30 | 182 | Sunday | 2 | ... | 1450 | 120 | Discharge Firearm With Intent | Assault | 144 | Morningside Heights (144) | 131 | Rouge (131) | -79.248477 | 43.837237 |
| 147549 | 396734 | GO-20241426669 | 1719723600000 | 1718859600000 | 2024 | June | 30 | 182 | Sunday | 15 | ... | 2132 | 200 | Theft From Motor Vehicle Over | Theft Over | 160 | Mimico-Queensway (160) | 017 | Mimico (includes Humber Bay Shores) (17) | -79.521053 | 43.616490 |
| 147550 | 396735 | GO-20241425318 | 1719723600000 | 1719637200000 | 2024 | June | 30 | 182 | Sunday | 11 | ... | 1430 | 100 | Assault | Assault | 018 | New Toronto (18) | 018 | New Toronto (18) | -79.513940 | 43.598831 |
147551 rows × 29 columns
Preprocess the Data
Index(['OBJECTID', 'EVENT_UNIQUE_ID', 'REPORT_DATE', 'OCC_DATE', 'REPORT_YEAR',
'REPORT_MONTH', 'REPORT_DAY', 'REPORT_DOY', 'REPORT_DOW', 'REPORT_HOUR',
'OCC_YEAR', 'OCC_MONTH', 'OCC_DAY', 'OCC_DOY', 'OCC_DOW', 'OCC_HOUR',
'DIVISION', 'LOCATION_TYPE', 'PREMISES_TYPE', 'UCR_CODE', 'UCR_EXT',
'OFFENCE', 'MCI_CATEGORY', 'HOOD_158', 'NEIGHBOURHOOD_158', 'HOOD_140',
'NEIGHBOURHOOD_140', 'LONG_WGS84', 'LAT_WGS84'],
dtype='object')
| | OBJECTID | EVENT_UNIQUE_ID | REPORT_DATE | OCC_DATE | REPORT_YEAR | REPORT_MONTH | REPORT_DAY | REPORT_DOY | REPORT_DOW | REPORT_HOUR | ... | UCR_CODE | UCR_EXT | OFFENCE | MCI_CATEGORY | HOOD_158 | NEIGHBOURHOOD_158 | HOOD_140 | NEIGHBOURHOOD_140 | LONG_WGS84 | LAT_WGS84 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 246675 | GO-20213605 | 1609477200000 | 1609477200000 | 2021 | January | 1 | 1 | Friday | 16 | ... | 2135 | 210 | Theft Of Motor Vehicle | Auto Theft | 94 | Wychwood (94) | 094 | Wychwood (94) | -79.425762 | 43.681769 |
| 1 | 246676 | GO-20213400 | 1609477200000 | 1609477200000 | 2021 | January | 1 | 1 | Friday | 16 | ... | 2135 | 210 | Theft Of Motor Vehicle | Auto Theft | 0 | NSA | NSA | NSA | 0.000000 | 0.000000 |
| 2 | 246677 | GO-20211123 | 1609477200000 | 1609477200000 | 2021 | January | 1 | 1 | Friday | 7 | ... | 2135 | 210 | Theft Of Motor Vehicle | Auto Theft | 31 | Yorkdale-Glen Park (31) | 031 | Yorkdale-Glen Park (31) | -79.460110 | 43.721013 |
| 3 | 246678 | GO-2021445 | 1609477200000 | 1609477200000 | 2021 | January | 1 | 1 | Friday | 1 | ... | 2135 | 210 | Theft Of Motor Vehicle | Auto Theft | 151 | Yonge-Doris (151) | 051 | Willowdale East (51) | -79.415293 | 43.778743 |
| 4 | 246679 | GO-20213400 | 1609477200000 | 1609477200000 | 2021 | January | 1 | 1 | Friday | 16 | ... | 2135 | 210 | Theft Of Motor Vehicle | Auto Theft | 0 | NSA | NSA | NSA | 0.000000 | 0.000000 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 147546 | 396731 | GO-20241427047 | 1719723600000 | 1719637200000 | 2024 | June | 30 | 182 | Sunday | 16 | ... | 1430 | 100 | Assault | Assault | 71 | Cabbagetown-South St.James Town (71) | 071 | Cabbagetown-South St.James Town (71) | -79.373043 | 43.663195 |
| 147547 | 396732 | GO-20241427869 | 1719723600000 | 1719723600000 | 2024 | June | 30 | 182 | Sunday | 18 | ... | 2133 | 200 | Theft Over - Shoplifting | Theft Over | 27 | York University Heights (27) | 027 | York University Heights (27) | -79.464942 | 43.759469 |
| 147548 | 396733 | GO-20241423116 | 1719723600000 | 1719637200000 | 2024 | June | 30 | 182 | Sunday | 2 | ... | 1450 | 120 | Discharge Firearm With Intent | Assault | 144 | Morningside Heights (144) | 131 | Rouge (131) | -79.248477 | 43.837237 |
| 147549 | 396734 | GO-20241426669 | 1719723600000 | 1718859600000 | 2024 | June | 30 | 182 | Sunday | 15 | ... | 2132 | 200 | Theft From Motor Vehicle Over | Theft Over | 160 | Mimico-Queensway (160) | 017 | Mimico (includes Humber Bay Shores) (17) | -79.521053 | 43.616490 |
| 147550 | 396735 | GO-20241425318 | 1719723600000 | 1719637200000 | 2024 | June | 30 | 182 | Sunday | 11 | ... | 1430 | 100 | Assault | Assault | 18 | New Toronto (18) | 018 | New Toronto (18) | -79.513940 | 43.598831 |
147551 rows × 29 columns
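In the table above, `HOOD_158` no longer holds zero-padded strings: `094` has become `94`, and the placeholder `NSA` has become `0`. The cell that did this is not shown; a minimal sketch of an equivalent conversion, with `hood_to_int` as a hypothetical helper name, could look like this:

```python
def hood_to_int(code):
    """Map a HOOD_158 string to an integer.

    Assumption taken from the table above: the placeholder 'NSA'
    is encoded as 0, and every other code is a zero-padded number.
    """
    return 0 if code == "NSA" else int(code)


codes = ["094", "NSA", "031", "151"]
print([hood_to_int(code) for code in codes])  # [94, 0, 31, 151]
```

Encoding `NSA` as `0` keeps the column numeric, at the cost of introducing a neighbourhood number that does not exist on the map.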
| | OCC_MONTH |
|---|---|
| 0 | January |
| 1 | January |
| 2 | January |
| 3 | January |
| 4 | January |
| ... | ... |
| 147546 | June |
| 147547 | June |
| 147548 | June |
| 147549 | June |
| 147550 | June |
147551 rows × 1 columns
| | OCC_MONTH |
|---|---|
| 0 | 1 |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| ... | ... |
| 147546 | 6 |
| 147547 | 6 |
| 147548 | 6 |
| 147549 | 6 |
| 147550 | 6 |
147551 rows × 1 columns
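The two `OCC_MONTH` tables above show month names being replaced by month numbers (`January` becomes `1`, `June` becomes `6`). As a hedged sketch of that step, assuming a simple lookup table was enough, the mapping can be written in plain Python; `MONTH_TO_NUMBER` and `month_number` are illustrative names, not taken from the notebook.

```python
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Lookup table: 'January' -> 1, ..., 'December' -> 12.
MONTH_TO_NUMBER = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}


def month_number(name):
    """Return the calendar number for an English month name.

    strip() is a defensive assumption: some text fields in the raw
    records (for example REPORT_DOW) carry trailing spaces.
    """
    return MONTH_TO_NUMBER[name.strip()]


print([month_number(m) for m in ["January", "June", "December "]])  # [1, 6, 12]
```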
| | EVENT_UNIQUE_ID | NEIGHBOURHOOD_158 | HOOD_158 | LAT_WGS84 | LONG_WGS84 | PREMISES_TYPE | OCC_DATE | OCC_YEAR | OCC_MONTH | OCC_DAY | OCC_HOUR | MCI_CATEGORY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GO-20213605 | Wychwood (94) | 94 | 43.681769 | -79.425762 | Outside | 1609477200000 | 2021 | 1 | 1 | 16 | Auto Theft |
| 1 | GO-20213400 | NSA | 0 | 0.000000 | 0.000000 | Commercial | 1609477200000 | 2021 | 1 | 1 | 4 | Auto Theft |
| 2 | GO-20211123 | Yorkdale-Glen Park (31) | 31 | 43.721013 | -79.460110 | Other | 1609477200000 | 2021 | 1 | 1 | 4 | Auto Theft |
| 3 | GO-2021445 | Yonge-Doris (151) | 151 | 43.778743 | -79.415293 | Other | 1609477200000 | 2021 | 1 | 1 | 1 | Auto Theft |
| 4 | GO-20213400 | NSA | 0 | 0.000000 | 0.000000 | Commercial | 1609477200000 | 2021 | 1 | 1 | 4 | Auto Theft |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 147546 | GO-20241427047 | Cabbagetown-South St.James Town (71) | 71 | 43.663195 | -79.373043 | Apartment | 1719637200000 | 2024 | 6 | 29 | 23 | Assault |
| 147547 | GO-20241427869 | York University Heights (27) | 27 | 43.759469 | -79.464942 | Commercial | 1719723600000 | 2024 | 6 | 30 | 18 | Theft Over |
| 147548 | GO-20241423116 | Morningside Heights (144) | 144 | 43.837237 | -79.248477 | Outside | 1719637200000 | 2024 | 6 | 29 | 21 | Assault |
| 147549 | GO-20241426669 | Mimico-Queensway (160) | 160 | 43.616490 | -79.521053 | Outside | 1718859600000 | 2024 | 6 | 20 | 13 | Theft Over |
| 147550 | GO-20241425318 | New Toronto (18) | 18 | 43.598831 | -79.513940 | House | 1719637200000 | 2024 | 6 | 29 | 20 | Assault |
147551 rows × 12 columns
| | Assault | Auto Theft | Break and Enter | Robbery | Theft Over |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 | 0 | 0 |
| 2 | 0 | 1 | 0 | 0 | 0 |
| 3 | 0 | 1 | 0 | 0 | 0 |
| 4 | 0 | 1 | 0 | 0 | 0 |
| ... | ... | ... | ... | ... | ... |
| 147546 | 1 | 0 | 0 | 0 | 0 |
| 147547 | 0 | 0 | 0 | 0 | 1 |
| 147548 | 1 | 0 | 0 | 0 | 0 |
| 147549 | 0 | 0 | 0 | 0 | 1 |
| 147550 | 1 | 0 | 0 | 0 | 0 |
147551 rows × 5 columns
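The indicator table above is a one-hot encoding of `MCI_CATEGORY`: one column per category, with a `1` in the column that matches the row's category. In pandas this is commonly done with `pandas.get_dummies`, though the notebook's cell is not shown. The sketch below reproduces the idea in plain Python; `one_hot` is a hypothetical helper.

```python
# The five categories that appear as column headers in the table above.
CATEGORIES = ["Assault", "Auto Theft", "Break and Enter", "Robbery", "Theft Over"]


def one_hot(category):
    """Return a {column: 0 or 1} dictionary with a single 1 for `category`."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown MCI_CATEGORY: {category!r}")
    return {name: int(name == category) for name in CATEGORIES}


for value in ["Auto Theft", "Assault", "Theft Over"]:
    print(value, list(one_hot(value).values()))
# Auto Theft [0, 1, 0, 0, 0]
# Assault [1, 0, 0, 0, 0]
# Theft Over [0, 0, 0, 0, 1]
```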
| | EVENT_UNIQUE_ID | NEIGHBOURHOOD_158 | HOOD_158 | LAT_WGS84 | LONG_WGS84 | PREMISES_TYPE | OCC_DATE | OCC_YEAR | OCC_MONTH | OCC_DAY | OCC_HOUR | Assault | Auto Theft | Break and Enter | Robbery | Theft Over |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GO-20213605 | Wychwood (94) | 94 | 43.681769 | -79.425762 | Outside | 1609477200000 | 2021 | 1 | 1 | 16 | 0 | 1 | 0 | 0 | 0 |
| 1 | GO-20213400 | NSA | 0 | 0.000000 | 0.000000 | Commercial | 1609477200000 | 2021 | 1 | 1 | 4 | 0 | 1 | 0 | 0 | 0 |
| 2 | GO-20211123 | Yorkdale-Glen Park (31) | 31 | 43.721013 | -79.460110 | Other | 1609477200000 | 2021 | 1 | 1 | 4 | 0 | 1 | 0 | 0 | 0 |
| 3 | GO-2021445 | Yonge-Doris (151) | 151 | 43.778743 | -79.415293 | Other | 1609477200000 | 2021 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 |
| 4 | GO-20213400 | NSA | 0 | 0.000000 | 0.000000 | Commercial | 1609477200000 | 2021 | 1 | 1 | 4 | 0 | 1 | 0 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 147546 | GO-20241427047 | Cabbagetown-South St.James Town (71) | 71 | 43.663195 | -79.373043 | Apartment | 1719637200000 | 2024 | 6 | 29 | 23 | 1 | 0 | 0 | 0 | 0 |
| 147547 | GO-20241427869 | York University Heights (27) | 27 | 43.759469 | -79.464942 | Commercial | 1719723600000 | 2024 | 6 | 30 | 18 | 0 | 0 | 0 | 0 | 1 |
| 147548 | GO-20241423116 | Morningside Heights (144) | 144 | 43.837237 | -79.248477 | Outside | 1719637200000 | 2024 | 6 | 29 | 21 | 1 | 0 | 0 | 0 | 0 |
| 147549 | GO-20241426669 | Mimico-Queensway (160) | 160 | 43.616490 | -79.521053 | Outside | 1718859600000 | 2024 | 6 | 20 | 13 | 0 | 0 | 0 | 0 | 1 |
| 147550 | GO-20241425318 | New Toronto (18) | 18 | 43.598831 | -79.513940 | House | 1719637200000 | 2024 | 6 | 29 | 20 | 1 | 0 | 0 | 0 | 0 |
147551 rows × 16 columns
| EVENT_UNIQUE_ID | NEIGHBOURHOOD_158 | HOOD_158 | LAT_WGS84 | LONG_WGS84 | PREMISES_TYPE | OCC_DATE | OCC_YEAR | OCC_MONTH | OCC_DAY | OCC_HOUR |
|---|---|---|---|---|---|---|---|---|---|---|
| GO-20211000033 | West Queen West (162) | 162 | 43.646286 | -79.408568 | Commercial | 1622264400000 | 2021 | 5 | 29 | 21 |
| GO-2021100004 | Morningside Heights (144) | 144 | 43.807252 | -79.162903 | Outside | 1610773200000 | 2021 | 1 | 16 | 17 |
| GO-20211000054 | Moss Park (73) | 73 | 43.657067 | -79.374531 | Apartment | 1622264400000 | 2021 | 5 | 29 | 22 |
| GO-20211000193 | Fort York-Liberty Village (163) | 163 | 43.636618 | -79.399704 | Apartment | 1622264400000 | 2021 | 5 | 29 | 23 |
| GO-20211000248 | Eglinton East (138) | 138 | 43.737099 | -79.246230 | Outside | 1622264400000 | 2021 | 5 | 29 | 21 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| GO-20249997 | Junction-Wallace Emerson (171) | 171 | 43.668917 | -79.442637 | Outside | 1704085200000 | 2024 | 1 | 1 | 18 |
| GO-202499972 | Edenbridge-Humber Valley (9) | 9 | 43.672705 | -79.522472 | House | 1705208400000 | 2024 | 1 | 14 | 3 |
| GO-2024999786 | Flemingdon Park (44) | 44 | 43.718727 | -79.334948 | Apartment | 1714539600000 | 2024 | 5 | 1 | 0 |
| GO-2024999795 | Oakridge (121) | 121 | 43.691225 | -79.288346 | Commercial | 1715230800000 | 2024 | 5 | 9 | 13 |
| GO-2024999882 | Eglinton East (138) | 138 | 43.738856 | -79.238421 | Commercial | 1715230800000 | 2024 | 5 | 9 | 14 |
129217 rows × 10 columns
| EVENT_UNIQUE_ID | Assault | Auto Theft | Break and Enter | Robbery | Theft Over |
|---|---|---|---|---|---|
| GO-20211000033 | 0 | 0 | 1 | 0 | 0 |
| GO-2021100004 | 0 | 1 | 0 | 0 | 0 |
| GO-20211000054 | 1 | 0 | 0 | 0 | 0 |
| GO-20211000193 | 1 | 0 | 0 | 0 | 0 |
| GO-20211000248 | 1 | 0 | 0 | 0 | 0 |
| ... | ... | ... | ... | ... | ... |
| GO-20249997 | 0 | 1 | 0 | 0 | 0 |
| GO-202499972 | 0 | 1 | 0 | 0 | 0 |
| GO-2024999786 | 1 | 0 | 0 | 0 | 0 |
| GO-2024999795 | 1 | 0 | 0 | 0 | 0 |
| GO-2024999882 | 1 | 0 | 0 | 0 | 0 |
129217 rows × 5 columns
| | EVENT_UNIQUE_ID | NEIGHBOURHOOD_158 | HOOD_158 | LAT_WGS84 | LONG_WGS84 | PREMISES_TYPE | OCC_DATE | OCC_YEAR | OCC_MONTH | OCC_DAY | OCC_HOUR | Assault | Auto Theft | Break and Enter | Robbery | Theft Over |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GO-20211000033 | West Queen West (162) | 162 | 43.646286 | -79.408568 | Commercial | 1622264400000 | 2021 | 5 | 29 | 21 | 0 | 0 | 1 | 0 | 0 |
| 1 | GO-2021100004 | Morningside Heights (144) | 144 | 43.807252 | -79.162903 | Outside | 1610773200000 | 2021 | 1 | 16 | 17 | 0 | 1 | 0 | 0 | 0 |
| 2 | GO-20211000054 | Moss Park (73) | 73 | 43.657067 | -79.374531 | Apartment | 1622264400000 | 2021 | 5 | 29 | 22 | 1 | 0 | 0 | 0 | 0 |
| 3 | GO-20211000193 | Fort York-Liberty Village (163) | 163 | 43.636618 | -79.399704 | Apartment | 1622264400000 | 2021 | 5 | 29 | 23 | 1 | 0 | 0 | 0 | 0 |
| 4 | GO-20211000248 | Eglinton East (138) | 138 | 43.737099 | -79.246230 | Outside | 1622264400000 | 2021 | 5 | 29 | 21 | 1 | 0 | 0 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 129212 | GO-20249997 | Junction-Wallace Emerson (171) | 171 | 43.668917 | -79.442637 | Outside | 1704085200000 | 2024 | 1 | 1 | 18 | 0 | 1 | 0 | 0 | 0 |
| 129213 | GO-202499972 | Edenbridge-Humber Valley (9) | 9 | 43.672705 | -79.522472 | House | 1705208400000 | 2024 | 1 | 14 | 3 | 0 | 1 | 0 | 0 | 0 |
| 129214 | GO-2024999786 | Flemingdon Park (44) | 44 | 43.718727 | -79.334948 | Apartment | 1714539600000 | 2024 | 5 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 129215 | GO-2024999795 | Oakridge (121) | 121 | 43.691225 | -79.288346 | Commercial | 1715230800000 | 2024 | 5 | 9 | 13 | 1 | 0 | 0 | 0 | 0 |
| 129216 | GO-2024999882 | Eglinton East (138) | 138 | 43.738856 | -79.238421 | Commercial | 1715230800000 | 2024 | 5 | 9 | 14 | 1 | 0 | 0 | 0 | 0 |
129217 rows × 16 columns
Create the Model
Creating the Testing and Training Datasets - First Approach
| | EVENT_UNIQUE_ID | NEIGHBOURHOOD_158 | HOOD_158 | LAT_WGS84 | LONG_WGS84 | PREMISES_TYPE | OCC_DATE | OCC_YEAR | OCC_MONTH | OCC_DAY | OCC_HOUR | Assault | Auto Theft | Break and Enter | Robbery | Theft Over | Total_Count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GO-20211000033 | West Queen West (162) | 162 | 43.646286 | -79.408568 | Commercial | 1622264400000 | 2021 | 5 | 29 | 21 | 0 | 0 | 1 | 0 | 0 | 1 |
| 1 | GO-2021100004 | Morningside Heights (144) | 144 | 43.807252 | -79.162903 | Outside | 1610773200000 | 2021 | 1 | 16 | 17 | 0 | 1 | 0 | 0 | 0 | 1 |
| 2 | GO-20211000054 | Moss Park (73) | 73 | 43.657067 | -79.374531 | Apartment | 1622264400000 | 2021 | 5 | 29 | 22 | 1 | 0 | 0 | 0 | 0 | 1 |
| 3 | GO-20211000193 | Fort York-Liberty Village (163) | 163 | 43.636618 | -79.399704 | Apartment | 1622264400000 | 2021 | 5 | 29 | 23 | 1 | 0 | 0 | 0 | 0 | 1 |
| 4 | GO-20211000248 | Eglinton East (138) | 138 | 43.737099 | -79.246230 | Outside | 1622264400000 | 2021 | 5 | 29 | 21 | 1 | 0 | 0 | 0 | 0 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 129212 | GO-20249997 | Junction-Wallace Emerson (171) | 171 | 43.668917 | -79.442637 | Outside | 1704085200000 | 2024 | 1 | 1 | 18 | 0 | 1 | 0 | 0 | 0 | 1 |
| 129213 | GO-202499972 | Edenbridge-Humber Valley (9) | 9 | 43.672705 | -79.522472 | House | 1705208400000 | 2024 | 1 | 14 | 3 | 0 | 1 | 0 | 0 | 0 | 1 |
| 129214 | GO-2024999786 | Flemingdon Park (44) | 44 | 43.718727 | -79.334948 | Apartment | 1714539600000 | 2024 | 5 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| 129215 | GO-2024999795 | Oakridge (121) | 121 | 43.691225 | -79.288346 | Commercial | 1715230800000 | 2024 | 5 | 9 | 13 | 1 | 0 | 0 | 0 | 0 | 1 |
| 129216 | GO-2024999882 | Eglinton East (138) | 138 | 43.738856 | -79.238421 | Commercial | 1715230800000 | 2024 | 5 | 9 | 14 | 1 | 0 | 0 | 0 | 0 | 1 |
129217 rows × 17 columns
| | HOOD_158 | OCC_YEAR | OCC_MONTH | Total_Count |
|---|---|---|---|---|
| 0 | 0 | 2021 | 1 | 44 |
| 1 | 0 | 2021 | 2 | 34 |
| 2 | 0 | 2021 | 3 | 45 |
| 3 | 0 | 2021 | 4 | 26 |
| 4 | 0 | 2021 | 5 | 37 |
| ... | ... | ... | ... | ... |
| 6672 | 174 | 2024 | 2 | 15 |
| 6673 | 174 | 2024 | 3 | 7 |
| 6674 | 174 | 2024 | 4 | 17 |
| 6675 | 174 | 2024 | 5 | 12 |
| 6676 | 174 | 2024 | 6 | 13 |
6677 rows × 4 columns
| | HOOD_158 | OCC_YEAR | OCC_MONTH | Assault | Auto Theft | Break and Enter | Robbery | Theft Over | Total_Count |
|---|---|---|---|---|---|---|---|---|---|
| 1895 | 49 | 2021 | 6 | 1 | 0 | 1 | 0 | 0 | 1 |
| 2229 | 58 | 2021 | 4 | 0 | 0 | 1 | 0 | 0 | 1 |
| 1909 | 49 | 2022 | 8 | 0 | 1 | 0 | 0 | 0 | 1 |
| 5208 | 140 | 2021 | 2 | 1 | 0 | 0 | 0 | 0 | 1 |
| 1890 | 49 | 2021 | 1 | 0 | 0 | 1 | 0 | 0 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 77 | 1 | 2023 | 12 | 30 | 45 | 28 | 4 | 13 | 118 |
| 68 | 1 | 2023 | 3 | 18 | 83 | 12 | 3 | 3 | 119 |
| 65 | 1 | 2022 | 12 | 24 | 67 | 15 | 4 | 10 | 120 |
| 71 | 1 | 2023 | 6 | 24 | 77 | 22 | 4 | 5 | 131 |
| 66 | 1 | 2023 | 1 | 18 | 91 | 14 | 3 | 7 | 133 |
6677 rows × 9 columns
| | HOOD_158 | OCC_YEAR | OCC_MONTH | Assault | Auto Theft | Break and Enter | Robbery | Theft Over | Total_Count |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2021 | 1 | 23 | 2 | 10 | 9 | 1 | 44 |
| 1 | 0 | 2021 | 2 | 24 | 1 | 8 | 1 | 0 | 34 |
| 2 | 0 | 2021 | 3 | 27 | 7 | 5 | 5 | 3 | 45 |
| 3 | 0 | 2021 | 4 | 16 | 1 | 3 | 3 | 3 | 26 |
| 4 | 0 | 2021 | 5 | 30 | 3 | 3 | 1 | 1 | 37 |
| 5 | 0 | 2021 | 6 | 22 | 3 | 3 | 2 | 3 | 32 |
| 6 | 0 | 2021 | 7 | 29 | 2 | 3 | 6 | 2 | 42 |
| 7 | 0 | 2021 | 8 | 48 | 6 | 2 | 7 | 3 | 66 |
| 8 | 0 | 2021 | 9 | 27 | 8 | 6 | 3 | 3 | 47 |
| 9 | 0 | 2021 | 10 | 39 | 9 | 15 | 2 | 2 | 65 |
| 10 | 0 | 2021 | 11 | 26 | 10 | 6 | 4 | 4 | 50 |
| 11 | 0 | 2021 | 12 | 22 | 5 | 8 | 2 | 1 | 37 |
| 12 | 0 | 2022 | 1 | 32 | 7 | 6 | 3 | 2 | 50 |
| 13 | 0 | 2022 | 2 | 30 | 10 | 3 | 1 | 1 | 45 |
| 14 | 0 | 2022 | 3 | 34 | 1 | 6 | 7 | 3 | 50 |
| 15 | 0 | 2022 | 4 | 28 | 8 | 3 | 11 | 3 | 52 |
| 16 | 0 | 2022 | 5 | 27 | 13 | 6 | 2 | 10 | 56 |
| 17 | 0 | 2022 | 6 | 25 | 7 | 4 | 4 | 2 | 41 |
| 18 | 0 | 2022 | 7 | 27 | 10 | 6 | 6 | 2 | 51 |
| 19 | 0 | 2022 | 8 | 28 | 6 | 6 | 6 | 6 | 52 |
| 20 | 0 | 2022 | 9 | 36 | 19 | 7 | 12 | 1 | 74 |
| 21 | 0 | 2022 | 10 | 35 | 10 | 5 | 7 | 2 | 58 |
| 22 | 0 | 2022 | 11 | 37 | 12 | 5 | 6 | 3 | 62 |
| 23 | 0 | 2022 | 12 | 25 | 12 | 2 | 4 | 1 | 43 |
| 24 | 0 | 2023 | 1 | 24 | 9 | 2 | 4 | 4 | 43 |
| 25 | 0 | 2023 | 2 | 17 | 12 | 1 | 5 | 2 | 37 |
| 26 | 0 | 2023 | 3 | 20 | 14 | 1 | 1 | 1 | 37 |
| 27 | 0 | 2023 | 4 | 17 | 12 | 1 | 2 | 1 | 33 |
| 28 | 0 | 2023 | 5 | 22 | 6 | 0 | 4 | 1 | 32 |
| 29 | 0 | 2023 | 6 | 13 | 13 | 0 | 0 | 1 | 27 |
| 30 | 0 | 2023 | 7 | 27 | 12 | 1 | 3 | 0 | 41 |
| 31 | 0 | 2023 | 8 | 15 | 11 | 1 | 1 | 1 | 29 |
| 32 | 0 | 2023 | 9 | 15 | 16 | 1 | 2 | 5 | 39 |
| 33 | 0 | 2023 | 10 | 20 | 14 | 0 | 1 | 1 | 36 |
| 34 | 0 | 2023 | 11 | 25 | 10 | 3 | 2 | 1 | 40 |
| 35 | 0 | 2023 | 12 | 24 | 5 | 0 | 6 | 0 | 34 |
| 36 | 0 | 2024 | 1 | 21 | 4 | 2 | 4 | 2 | 33 |
| 37 | 0 | 2024 | 2 | 17 | 4 | 0 | 4 | 1 | 26 |
| 38 | 0 | 2024 | 3 | 19 | 7 | 4 | 1 | 3 | 33 |
| 39 | 0 | 2024 | 4 | 19 | 5 | 1 | 1 | 0 | 26 |
| 40 | 0 | 2024 | 5 | 17 | 12 | 4 | 2 | 2 | 36 |
| 41 | 0 | 2024 | 6 | 20 | 7 | 0 | 4 | 1 | 31 |
| | HOOD_158 | OCC_YEAR | OCC_MONTH | Assault | Auto Theft | Break and Enter | Robbery | Theft Over | Total_Count |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2021 | 1 | 18 | 35 | 7 | 1 | 3 | 62 |
| 1 | 1 | 2021 | 2 | 17 | 17 | 5 | 1 | 3 | 43 |
| 2 | 1 | 2021 | 3 | 15 | 20 | 8 | 6 | 6 | 54 |
| 3 | 1 | 2021 | 4 | 11 | 31 | 4 | 2 | 4 | 52 |
| 4 | 1 | 2021 | 5 | 18 | 26 | 9 | 5 | 4 | 62 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6630 | 174 | 2024 | 2 | 9 | 0 | 5 | 1 | 1 | 15 |
| 6631 | 174 | 2024 | 3 | 6 | 1 | 0 | 0 | 0 | 7 |
| 6632 | 174 | 2024 | 4 | 12 | 2 | 2 | 0 | 1 | 17 |
| 6633 | 174 | 2024 | 5 | 8 | 1 | 2 | 0 | 1 | 12 |
| 6634 | 174 | 2024 | 6 | 6 | 4 | 1 | 1 | 1 | 13 |
6635 rows × 9 columns
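The tables in this subsection are consistent with two steps, neither of which is shown as code: per-event rows (each with `Total_Count` = 1) are summed per `HOOD_158`, `OCC_YEAR` and `OCC_MONTH`, and the 42 rows with `HOOD_158` = 0 (the `NSA` placeholder) are then dropped, taking the table from 6677 to 6635 rows. Note that `Total_Count` can be smaller than the sum of the category columns: for `HOOD_158` = 0 in January 2021 the categories add up to 45 while `Total_Count` is 44, which suggests that a single event can carry more than one category. The following is a minimal sketch under those assumptions, using made-up events and a hypothetical `event` helper.

```python
from collections import defaultdict

CATEGORIES = ["Assault", "Auto Theft", "Break and Enter", "Robbery", "Theft Over"]
COUNT_COLUMNS = CATEGORIES + ["Total_Count"]


def event(hood, year, month, *categories):
    """Hypothetical helper: build one per-event row with 0/1 indicators."""
    row = {"HOOD_158": hood, "OCC_YEAR": year, "OCC_MONTH": month}
    row.update({name: int(name in categories) for name in CATEGORIES})
    row["Total_Count"] = 1  # every event counts once, whatever its categories
    return row


# Made-up events for illustration only.
events = [
    event(1, 2021, 1, "Assault"),
    event(1, 2021, 1, "Assault", "Robbery"),  # one event, two categories
    event(1, 2021, 2, "Auto Theft"),
    event(0, 2021, 1, "Assault"),             # HOOD_158 == 0 stands for 'NSA'
]

# Sum every count column per (neighbourhood, year, month).
totals = defaultdict(lambda: dict.fromkeys(COUNT_COLUMNS, 0))
for row in events:
    key = (row["HOOD_158"], row["OCC_YEAR"], row["OCC_MONTH"])
    for column in COUNT_COLUMNS:
        totals[key][column] += row[column]

# Drop the placeholder neighbourhood and keep the keys sorted.
monthly = {key: counts for key, counts in sorted(totals.items()) if key[0] != 0}

for key, counts in monthly.items():
    print(key, counts)
```

In the first group the categories sum to 3 while `Total_Count` is 2, mirroring the pattern seen in the real table.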
Testing Different Models - First Approach
RandomForestRegressor()
MultiOutputRegressor(estimator=HistGradientBoostingRegressor())
Lasso()
ExtraTreesRegressor()
KNeighborsRegressor()
ElasticNet()
RadiusNeighborsRegressor()
/usr/local/lib/python3.10/dist-packages/numpy/core/numeric.py:407: RuntimeWarning: invalid value encountered in cast multiarray.copyto(res, fill_value, casting='unsafe')
| Model | Mean Squared Error | R-squared | Mean Absolute Error |
|---|---|---|---|
| Random Forest | 16.345516578762318 | 0.42185755406596 | 2.507420886075949 |
| Histogram-Based Gradient Boosting | 13.53697402119441 | 0.5401264890283979 | 2.3624557106149555 |
| Lasso Regressor | 56.06504159690181 | 0.0008593089048466821 | 4.053649104634542 |
| Extra-Trees Regressor | 21.009235548523208 | 0.2633688550325657 | 2.8370112517580868 |
| K-Nearest Neighbors Regressor | 22.812032348804497 | 0.3917381351580209 | 2.9212025316455694 |
| Elastic Net Regressor | 55.83116626511688 | 0.0018104521500861837 | 4.0478073837331445 |
| Radius Neighbors Regressor | 8.973691110784241e+34 | -1.48166511902087e+34 | 9729295397526136.0 |
Optimizing the models - First Approach
Perform Hyper-Parameter Tuning on the Models
Hyper-parameter tuning was performed on the three models that yielded the highest R-squared scores:
- Random Forest Regressor
- Histogram-Based Gradient Boosting
- K-Nearest Neighbors Regressor
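The "Best Hyperparameters" output in the iterations that follow is consistent with a randomized search. A minimal sketch of such a search on the Random Forest is shown here; the synthetic data, the parameter ranges, `n_iter`, and `cv` are assumptions and not the notebook's actual settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic regression problem (assumption, for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

# Hypothetical search space using the parameter names seen in the output.
param_distributions = {
    "n_estimators": [50, 100],
    "max_depth": [5, 10],
    "min_samples_split": [10, 15],
    "min_samples_leaf": [2, 4],
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=1),
    param_distributions=param_distributions,
    n_iter=5,          # sample 5 of the 16 possible combinations
    cv=3,
    scoring="r2",
    random_state=1,
)
search.fit(X, y)
print("Best Hyperparameters:", search.best_params_)
```

When `n_iter` exceeds the number of combinations in the space, scikit-learn falls back to trying every combination and emits a warning recommending `GridSearchCV`.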
Iterations for Random Forest (RF) Regressor
Best Hyperparameters: {'n_estimators': 100, 'min_samples_split': 10, 'min_samples_leaf': 2, 'max_depth': 10}
Mean Squared Error: 17.928958393105038
R-squared: 0.49663590673300156
mean absolute error: 2.6323390583657864
Best Hyperparameters: {'n_estimators': 100, 'min_samples_split': 10, 'min_samples_leaf': 2, 'max_depth': 10}
Mean Squared Error: 17.928958393105038
R-squared: 0.49663590673300156
mean absolute error: 2.6323390583657864
Best Hyperparameters: {'n_estimators': 100, 'min_samples_split': 15, 'min_samples_leaf': 2, 'max_depth': 10}
Mean Squared Error: 17.900895599495165
R-squared: 0.49718000005206253
mean absolute error: 2.6397685170607654
Best Hyperparameters: {'n_estimators': 100, 'min_samples_split': 10, 'min_samples_leaf': 4, 'max_depth': 5}
Mean Squared Error: 37.38108382013141
R-squared: 0.24787742552140726
mean absolute error: 3.5582910596792368
Best Hyperparameters: {'n_estimators': 100, 'min_samples_split': 15, 'min_samples_leaf': 2, 'max_depth': 10}
Mean Squared Error: 17.900895599495165
R-squared: 0.49718000005206253
mean absolute error: 2.6397685170607654
Best Hyperparameters: {'n_estimators': 100, 'min_samples_split': 15, 'min_samples_leaf': 2, 'max_depth': 10}
Mean Squared Error: 17.900895599495165
R-squared: 0.49718000005206253
mean absolute error: 2.6397685170607654
Best Hyperparameters: {'n_estimators': 100, 'min_samples_split': 15, 'min_samples_leaf': 2, 'max_depth': 10}
Mean Squared Error: 17.900895599495165
R-squared: 0.49718000005206253
mean absolute error: 2.6397685170607654
Best Hyperparameters: {'n_estimators': 300, 'min_samples_split': 15, 'min_samples_leaf': 2, 'max_depth': 10}
Mean Squared Error: 18.358640232112
R-squared: 0.49228192721420205
mean absolute error: 2.66464858496764
Best Hyperparameters: {'n_estimators': 250, 'min_samples_split': 15, 'min_samples_leaf': 2, 'max_depth': 10}
Mean Squared Error: 18.348343979326263
R-squared: 0.49257004515939246
mean absolute error: 2.6625904431614718
Best Hyperparameters: {'n_estimators': 450, 'min_samples_split': 15, 'min_samples_leaf': 8, 'max_depth': 10}
Mean Squared Error: 18.777789215683814
R-squared: 0.485768406447904
mean absolute error: 2.6973421196246625
Best Hyperparameters: {'n_estimators': 400, 'min_samples_split': 55, 'min_samples_leaf': 2, 'max_depth': 10}
Mean Squared Error: 23.92497566088106
R-squared: 0.4239082506998882
mean absolute error: 2.9652493531921866
Best Hyperparameters: {'n_estimators': 100, 'min_samples_split': 15, 'min_samples_leaf': 2, 'max_depth': 10}
Mean Squared Error: 17.900895599495165
R-squared: 0.49718000005206253
mean absolute error: 2.6397685170607654
Iterations for Histogram-Based Gradient Boosting (HBGB)
Best Hyperparameters: {'estimator__max_depth': 7, 'estimator__learning_rate': 0.01, 'estimator__l2_regularization': 0.2}
Mean Squared Error: 32.18282924168508
R-squared: 0.3320401942652549
mean absolute error: 3.2649741520630413
Best Hyperparameters: {'estimator__min_samples_leaf': 40, 'estimator__max_depth': None, 'estimator__learning_rate': 0.01, 'estimator__l2_regularization': 0.0}
Mean Squared Error: 34.768348922898916
R-squared: 0.28056428760433166
mean absolute error: 3.322235227775795
Best Hyperparameters: {'estimator__min_samples_leaf': 10, 'estimator__max_depth': None, 'estimator__learning_rate': 0.01, 'estimator__l2_regularization': 0.1}
Mean Squared Error: 27.707811341396553
R-squared: 0.39456052080811127
mean absolute error: 3.123993487397762
Iterations for K-Nearest Neighbors (KNN) Regressor
Best Hyperparameters: {'weights': 'distance', 'p': 1, 'n_neighbors': 9}
Mean Squared Error: 18.85456904070352
R-squared: 0.46409840345829956
mean absolute error: 2.643735813566883
/usr/local/lib/python3.10/dist-packages/sklearn/model_selection/_search.py:320: UserWarning: The total space of parameters 16 is smaller than n_iter=20. Running 16 iterations. For exhaustive searches, use GridSearchCV. warnings.warn(
Best Hyperparameters: {'weights': 'distance', 'p': 1, 'n_neighbors': 9}
Mean Squared Error: 18.85456904070352
R-squared: 0.46409840345829956
mean absolute error: 2.643735813566883
Best Hyperparameters: {'weights': 'distance', 'p': 1, 'n_neighbors': 11}
Mean Squared Error: 19.612298984089254
R-squared: 0.46018526487952816
mean absolute error: 2.677782519371975
Apply Boosting Algorithms
MultiOutputRegressor(estimator=AdaBoostRegressor(estimator=HistGradientBoostingRegressor(random_state=1), random_state=1))
AdaBoostRegressor Mean Squared Error: 15.514865328785016
AdaBoostRegressor R-squared: 0.4603984456235
AdaBoostRegressor Mean Absolute Error: 2.5524177817460036
RegressorChain(base_estimator=RandomForestRegressor(max_depth=10, min_samples_leaf=2, min_samples_split=15, random_state=1))
Regression Chain Model Mean Squared Error: 14.907590528079728
Regression Chain Model R-squared: 0.5211560076522394
Regression Chain Model Mean Absolute Error: 2.4530789200162197
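A regressor chain is a multi-output strategy rather than a boosting method: targets are predicted in order, and each later model also receives the earlier targets as extra features. The sketch below illustrates the idea on synthetic data (an assumption); the Random Forest settings mirror the representation printed above, except `n_estimators`, which is reduced here for speed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import RegressorChain

# Synthetic data (assumption): the second target depends on the first,
# which is the situation a chain is designed to exploit.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 3))
first = X[:, 0] + rng.normal(scale=0.1, size=300)
second = first + X[:, 1] + rng.normal(scale=0.1, size=300)
Y = np.column_stack([first, second])

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=1
)

# The base estimator is passed positionally because its keyword name
# differs between scikit-learn releases.
chain = RegressorChain(
    RandomForestRegressor(
        n_estimators=50,
        max_depth=10,
        min_samples_leaf=2,
        min_samples_split=15,
        random_state=1,
    ),
    order=[0, 1],
)
chain.fit(X_train, Y_train)
Y_pred = chain.predict(X_test)
print("R-squared:", r2_score(Y_test, Y_pred))

# The second link sees the 3 original features plus the first target.
print("Features seen by second link:", chain.estimators_[1].n_features_in_)
```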
Creating the Testing and Training Datasets - Second Approach
Testing Different Models - Second Approach
Predicting Total Counts
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1473: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). return fit_method(estimator, *args, **kwargs)
RandomForestRegressor()
/usr/local/lib/python3.10/dist-packages/sklearn/utils/validation.py:1339: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
HistGradientBoostingRegressor()
Lasso()
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1473: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). return fit_method(estimator, *args, **kwargs)
ExtraTreesRegressor()
KNeighborsRegressor()
ElasticNet()
RadiusNeighborsRegressor()
/usr/local/lib/python3.10/dist-packages/numpy/core/numeric.py:407: RuntimeWarning: invalid value encountered in cast multiarray.copyto(res, fill_value, casting='unsafe')
| Model | Mean Squared Error | R-squared | Mean Absolute Error |
|---|---|---|---|
| Random Forest | 47.507770147679324 | 0.7805812475805141 | 5.139229957805908 |
| Histogram-Based Gradient Boosting | 42.3805269624916 | 0.8042618644469353 | 5.0338833270972545 |
| Lasso | 215.0452817500199 | 0.006794735079959646 | 10.677136243599007 |
| Extra-Trees | 61.829009704641344 | 0.7144373619188399 | 5.8675316455696205 |
| K-Nearest Neighbors | 78.54392405063292 | 0.6372380818601189 | 6.74535864978903 |
| Elastic Net | 214.33255721979504 | 0.010086515072044833 | 10.465525613654595 |
| Radius Neighbors | 8.973691110784241e+34 | -4.144576986049706e+32 | 9729295397526140.0 |
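The `DataConversionWarning` messages in this approach indicate that the single target was passed as a one-column frame of shape `(n_samples, 1)`. A minimal sketch of the fix the warning itself suggests, flattening the target with `ravel()`, is given here; the synthetic frame and the variable names `y_column` and `y_flat` are assumptions.

```python
import warnings

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.exceptions import DataConversionWarning

# Synthetic frame with the notebook's column names (values are made up).
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "HOOD_158": rng.integers(1, 175, 120),
    "OCC_YEAR": rng.integers(2021, 2025, 120),
    "OCC_MONTH": rng.integers(1, 13, 120),
    "Total_Count": rng.poisson(30, 120),
})

X = df[["HOOD_158", "OCC_YEAR", "OCC_MONTH"]]
y_column = df[["Total_Count"]]          # shape (120, 1): triggers the warning
y_flat = y_column.to_numpy().ravel()    # shape (120,): what the estimator expects

# Record warnings while fitting on the flattened target.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    model = RandomForestRegressor(n_estimators=20, random_state=1).fit(X, y_flat)

shape_warnings = [
    w for w in caught if issubclass(w.category, DataConversionWarning)
]
print(y_column.shape, y_flat.shape, len(shape_warnings))
```

Selecting the column as a Series, for example `df["Total_Count"]`, would have the same effect as calling `ravel()`.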
Predicting Assault Counts
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1473: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). return fit_method(estimator, *args, **kwargs)
RandomForestRegressor()
/usr/local/lib/python3.10/dist-packages/sklearn/utils/validation.py:1339: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
HistGradientBoostingRegressor()
| Model | Mean Squared Error | R-squared | Mean Absolute Error |
|---|---|---|---|
| Random Forest | 18.884251160337556 | 0.7621670878841997 | 3.122795358649789 |
| Histogram-Based Gradient Boosting | 16.26862998275755 | 0.7951088654730386 | 3.003620298129256 |
Predicting Auto Theft Counts
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1473: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). return fit_method(estimator, *args, **kwargs)
RandomForestRegressor()
/usr/local/lib/python3.10/dist-packages/sklearn/utils/validation.py:1339: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
HistGradientBoostingRegressor()
| Model | Mean Squared Error | R-squared | Mean Absolute Error |
|---|---|---|---|
| Random Forest | 18.344488924050633 | 0.31502817625221924 | 2.793428270042194 |
| Histogram-Based Gradient Boosting | 12.029894217961466 | 0.5508112209565743 | 2.4878698854824157 |
Predicting Break and Enter Counts
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1473: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). return fit_method(estimator, *args, **kwargs)
RandomForestRegressor()
/usr/local/lib/python3.10/dist-packages/sklearn/utils/validation.py:1339: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
HistGradientBoostingRegressor()
| Model | Mean Squared Error | R-squared | Mean Absolute Error |
|---|---|---|---|
| Random Forest | 8.362676476793249 | 0.28879841796872596 | 2.052943037974684 |
| Histogram-Based Gradient Boosting | 7.185335567450399 | 0.38892506039455554 | 1.9032188779372148 |
Predicting Robbery Counts
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1473: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). return fit_method(estimator, *args, **kwargs)
RandomForestRegressor()
/usr/local/lib/python3.10/dist-packages/sklearn/utils/validation.py:1339: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
HistGradientBoostingRegressor()
| Model | Mean Squared Error | R-squared | Mean Absolute Error |
|---|---|---|---|
| Random Forest | 2.590130696202532 | 0.21079584323330303 | 1.1071835443037976 |
| Histogram-Based Gradient Boosting | 2.218185363946034 | 0.3241263414730897 | 0.9848720183410007 |
Predicting Theft Over Counts
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1473: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). return fit_method(estimator, *args, **kwargs)
RandomForestRegressor()
/usr/local/lib/python3.10/dist-packages/sklearn/utils/validation.py:1339: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
HistGradientBoostingRegressor()
| Model | Mean Squared Error | R-squared | Mean Absolute Error |
|---|---|---|---|
| Random Forest | 1.4372389240506327 | 0.21472270183775177 | 0.8258755274261604 |
| Histogram-Based Gradient Boosting | 1.139272032559403 | 0.3775255814261935 | 0.7612698567025898 |
Optimizing the models - Second Approach
Auto Theft Models
Apply Regression Chain Boosting Algorithm on RF Regressor
Regression Chain Random Forest Regressor (Auto Theft) Mean Squared Error: 18.3886164556962
Regression Chain Random Forest Regressor (Auto Theft) R-squared: 0.31338048162557164
Regression Chain Random Forest Regressor (Auto Theft) Mean Absolute Error: 2.798417721518987
Perform Hyper-Parameter Tuning on RF Regression Chain
Fitting 5 folds for each of 20 candidates, totalling 100 fits
Best Parameters: {'base_estimator__n_estimators': 50, 'base_estimator__min_samples_split': 10, 'base_estimator__min_samples_leaf': 1, 'base_estimator__max_depth': 5}
Regression Chain Random Forest Regressor (Auto Theft) Mean Squared Error: 15.601535081978382
Regression Chain Random Forest Regressor (Auto Theft) R-squared: 0.4174483692289197
Regression Chain Random Forest Regressor (Auto Theft) Mean Absolute Error: 2.924723449937228
Apply ADA Boosting Algorithm on HBGB
/usr/local/lib/python3.10/dist-packages/sklearn/utils/validation.py:1339: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
AdaBoost Mean Squared Error: 18.659485596532548
AdaBoost R-squared: 0.30326639612753314
AdaBoost Mean Absolute Error: 2.8809512187634048
Perform Hyper-Parameter Tuning on ADA Boosted HBGB Regressor
Hyper-parameter tuning was attempted on the ADA Boosted HBGB Regressor, but the searches took too long to run and did not yield significantly better R-squared scores.
Total Count Models
Apply Regression Chain Boosting Algorithm on RF Regressor
Regression Chain Random Forest Regressor (Total Count) Mean Squared Error: 48.1800003164557
Regression Chain Random Forest Regressor (Total Count) R-squared: 0.7774764943051415
Regression Chain Random Forest Regressor (Total Count) Mean Absolute Error: 5.157500000000001
Perform Hyper-Parameter Tuning on RF Regression Chain
Fitting 5 folds for each of 20 candidates, totalling 100 fits
Best Parameters: {'base_estimator__n_estimators': 100, 'base_estimator__min_samples_split': 10, 'base_estimator__min_samples_leaf': 1, 'base_estimator__max_depth': 10}
Regression Chain Random Forest Regressor (Total Count) Mean Squared Error: 66.41763382339101
Regression Chain Random Forest Regressor (Total Count) R-squared: 0.6932444038758028
Regression Chain Random Forest Regressor (Total Count) Mean Absolute Error: 6.172144837821778
Fitting 5 folds for each of 20 candidates, totalling 100 fits
Best Parameters: {'base_estimator__n_estimators': 100, 'base_estimator__min_samples_split': 10, 'base_estimator__min_samples_leaf': 1, 'base_estimator__max_depth': 10}
Regression Chain Random Forest Regressor (Total Count) Mean Squared Error: 68.88399843268999
Regression Chain Random Forest Regressor (Total Count) R-squared: 0.6818532852461192
Regression Chain Random Forest Regressor (Total Count) Mean Absolute Error: 6.253101443805737
Fitting 5 folds for each of 81 candidates, totalling 405 fits
Best Parameters: {'base_estimator__n_estimators': 150, 'base_estimator__min_samples_split': 5, 'base_estimator__min_samples_leaf': 1, 'base_estimator__max_depth': 10}
Regression Chain Random Forest Regressor (Total Count) Mean Squared Error: 71.02147143653406
Regression Chain Random Forest Regressor (Total Count) R-squared: 0.6719811809908387
Regression Chain Random Forest Regressor (Total Count) Mean Absolute Error: 6.345448931345114
Fitting 5 folds for each of 81 candidates, totalling 405 fits
Best Parameters: {'base_estimator__n_estimators': 100, 'base_estimator__min_samples_split': 2, 'base_estimator__min_samples_leaf': 1, 'base_estimator__max_depth': 10}
Regression Chain Random Forest Regressor (Total Count) Mean Squared Error: 69.62343491331661
Regression Chain Random Forest Regressor (Total Count) R-squared: 0.6784381337968257
Regression Chain Random Forest Regressor (Total Count) Mean Absolute Error: 6.2914368593362715
Apply ADA Boosting Algorithm on HBGB
/usr/local/lib/python3.10/dist-packages/sklearn/utils/validation.py:1339: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
AdaBoost Mean Squared Error: 45.44923474580936
AdaBoost R-squared: 0.7900887716820576
AdaBoost Mean Absolute Error: 5.161318291385233
Perform Hyper-Parameter Tuning on ADA Boosted HBGB Regressor
Hyper-parameter tuning was attempted on the ADA Boosted HBGB Regressor, but the searches took too long to run and did not yield significantly better R-squared scores.
Perform Hyper-Parameter Tuning on RF Regressor
Fitting 5 folds for each of 20 candidates, totalling 100 fits
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1473: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). return fit_method(estimator, *args, **kwargs)
Best Parameters: {'n_estimators': 150, 'min_samples_split': 5, 'min_samples_leaf': 2, 'max_depth': 10}
Random Forest Regressor (Total Count) Mean Squared Error: 71.17906332274143
Random Forest Regressor (Total Count) R-squared: 0.6712533292108969
Random Forest Regressor (Total Count) Mean Absolute Error: 6.347966311436887
Fitting 5 folds for each of 20 candidates, totalling 100 fits
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1473: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). return fit_method(estimator, *args, **kwargs)
Best Parameters: {'n_estimators': 50, 'min_samples_leaf': 2, 'max_depth': 10}
Random Forest Regressor (Total Count) Mean Squared Error: 68.1934671493194
Random Forest Regressor (Total Count) R-squared: 0.6850425638048225
Random Forest Regressor (Total Count) Mean Absolute Error: 6.233294034033055
Fitting 5 folds for each of 20 candidates, totalling 100 fits
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1473: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel(). return fit_method(estimator, *args, **kwargs)
Best Parameters: {'n_estimators': 50, 'min_samples_split': 15, 'min_samples_leaf': 2, 'max_depth': 10}
Random Forest Regressor (Total Count) Mean Squared Error: 70.81104307153397
Random Forest Regressor (Total Count) R-squared: 0.6729530626257474
Random Forest Regressor (Total Count) Mean Absolute Error: 6.383808010790825
Perform Hyper-Parameter Tuning on HBGB Regressor
Fitting 5 folds for each of 20 candidates, totalling 100 fits
/usr/local/lib/python3.10/dist-packages/sklearn/utils/validation.py:1339: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
Best Parameters: {'min_samples_leaf': 10, 'max_iter': 300, 'max_depth': 10, 'learning_rate': 0.01, 'l2_regularization': 0.0}
HistGradientBoostingRegressor (Total Count) Mean Squared Error: 68.39689337906064
HistGradientBoostingRegressor (Total Count) R-squared: 0.68410302213826
HistGradientBoostingRegressor (Total Count) Mean Absolute Error: 6.424647018374788
Create a Voting Ensemble Learning Model with the default RF and HBGB Models
Voting Regressor with fitted RF and HBGB
/usr/local/lib/python3.10/dist-packages/sklearn/ensemble/_voting.py:694: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
Voting Regressor (Total Count) Mean Squared Error: 41.42631600357465
Voting Regressor (Total Count) R-squared: 0.8086689704319083
Voting Regressor (Total Count) Mean Absolute Error: 4.853770088554295
Voting Regressor with unfitted RF and HBGB
/usr/local/lib/python3.10/dist-packages/sklearn/ensemble/_voting.py:694: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
Voting Regressor (Total Count) Mean Squared Error: 41.182782954814
Voting Regressor (Total Count) R-squared: 0.8097937489168987
Voting Regressor (Total Count) Mean Absolute Error: 4.8440160089608995
Final Model
Hyper-Parameter Tuned Voting Regressor with unfitted RF and HBGB Regressors
Fitting 5 folds for each of 3 candidates, totalling 15 fits
/usr/local/lib/python3.10/dist-packages/sklearn/ensemble/_voting.py:694: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). y = column_or_1d(y, warn=True)
Best Parameters for Voting Regressor: {'n_jobs': -1, 'weights': [1, 2]}
Voting Regressor (Total Count) Mean Squared Error: 40.66584193188937
Voting Regressor (Total Count) R-squared: 0.8121812858181674
Voting Regressor (Total Count) Mean Absolute Error: 4.862563006179832
| | NEIGHBOURHOOD_158 | HOOD_158 |
|---|---|---|
| 0 | West Queen West (162) | 162 |
| 1 | Morningside Heights (144) | 144 |
| 2 | Moss Park (73) | 73 |
| 3 | Fort York-Liberty Village (163) | 163 |
| 4 | Eglinton East (138) | 138 |
| ... | ... | ... |
| 1091 | Broadview North (57) | 57 |
| 1337 | Guildwood (140) | 140 |
| 1369 | Lambton Baby Point (114) | 114 |
| 1412 | Bayview Woods-Steeles (49) | 49 |
| 1778 | Woodbine-Lumsden (60) | 60 |
159 rows × 2 columns
| | HOOD_158 | OCC_YEAR | OCC_MONTH | Assault | Auto Theft | Break and Enter | Robbery | Theft Over | Total_Count | NEIGHBOURHOOD_158 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2021 | 1 | 18 | 35 | 7 | 1 | 3 | 62 | West Humber-Clairville (1) |
| 1 | 1 | 2021 | 2 | 17 | 17 | 5 | 1 | 3 | 43 | West Humber-Clairville (1) |
| 2 | 1 | 2021 | 3 | 15 | 20 | 8 | 6 | 6 | 54 | West Humber-Clairville (1) |
| 3 | 1 | 2021 | 4 | 11 | 31 | 4 | 2 | 4 | 52 | West Humber-Clairville (1) |
| 4 | 1 | 2021 | 5 | 18 | 26 | 9 | 5 | 4 | 62 | West Humber-Clairville (1) |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6630 | 174 | 2024 | 2 | 9 | 0 | 5 | 1 | 1 | 15 | South Eglinton-Davisville (174) |
| 6631 | 174 | 2024 | 3 | 6 | 1 | 0 | 0 | 0 | 7 | South Eglinton-Davisville (174) |
| 6632 | 174 | 2024 | 4 | 12 | 2 | 2 | 0 | 1 | 17 | South Eglinton-Davisville (174) |
| 6633 | 174 | 2024 | 5 | 8 | 1 | 2 | 0 | 1 | 12 | South Eglinton-Davisville (174) |
| 6634 | 174 | 2024 | 6 | 6 | 4 | 1 | 1 | 1 | 13 | South Eglinton-Davisville (174) |
6635 rows × 10 columns
| | NEIGHBOURHOOD_158 | HOOD_158 | OCC_YEAR | OCC_MONTH | Assault | Auto Theft | Break and Enter | Robbery | Theft Over | Total_Count |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | West Humber-Clairville (1) | 1 | 2021 | 1 | 18 | 35 | 7 | 1 | 3 | 62 |
| 1 | West Humber-Clairville (1) | 1 | 2021 | 2 | 17 | 17 | 5 | 1 | 3 | 43 |
| 2 | West Humber-Clairville (1) | 1 | 2021 | 3 | 15 | 20 | 8 | 6 | 6 | 54 |
| 3 | West Humber-Clairville (1) | 1 | 2021 | 4 | 11 | 31 | 4 | 2 | 4 | 52 |
| 4 | West Humber-Clairville (1) | 1 | 2021 | 5 | 18 | 26 | 9 | 5 | 4 | 62 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6630 | South Eglinton-Davisville (174) | 174 | 2024 | 2 | 9 | 0 | 5 | 1 | 1 | 15 |
| 6631 | South Eglinton-Davisville (174) | 174 | 2024 | 3 | 6 | 1 | 0 | 0 | 0 | 7 |
| 6632 | South Eglinton-Davisville (174) | 174 | 2024 | 4 | 12 | 2 | 2 | 0 | 1 | 17 |
| 6633 | South Eglinton-Davisville (174) | 174 | 2024 | 5 | 8 | 1 | 2 | 0 | 1 | 12 |
| 6634 | South Eglinton-Davisville (174) | 174 | 2024 | 6 | 6 | 4 | 1 | 1 | 1 | 13 |
6635 rows × 10 columns
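The three tables above are consistent with attaching neighbourhood names to the counts by their `HOOD_158` code and then moving the name column to the front. A minimal sketch with pandas follows; the frame names `crime` and `names` are assumptions, and only a few rows taken from the tables are used.

```python
import pandas as pd

# A few rows of monthly counts, keyed by neighbourhood code.
crime = pd.DataFrame({
    "HOOD_158": [1, 1, 174],
    "OCC_YEAR": [2021, 2021, 2024],
    "OCC_MONTH": [1, 2, 6],
    "Total_Count": [62, 43, 13],
})

# Lookup table mapping each code to its display name.
names = pd.DataFrame({
    "NEIGHBOURHOOD_158": [
        "West Humber-Clairville (1)",
        "South Eglinton-Davisville (174)",
    ],
    "HOOD_158": [1, 174],
})

# Left join keeps every count row; validate guards against duplicate codes
# in the lookup table silently multiplying rows.
merged = crime.merge(names, on="HOOD_158", how="left", validate="many_to_one")

# Move the name column to the front, keeping the other columns in order.
front = ["NEIGHBOURHOOD_158"]
merged = merged[front + [c for c in merged.columns if c not in front]]
print(merged)
```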
Results
Visualizations based on current data
Total count of Criminal Incidents for past three years
Crime Statistics for Previous Years with a Month Breakdown
Crime Statistics for 2021
| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Count |
|---|---|---|---|
| 139 | West Humber-Clairville (1) | 2021 | 849 |
| 93 | Moss Park (73) | 2021 | 778 |
| 36 | Downtown Yonge East (168) | 2021 | 774 |
| 156 | York University Heights (27) | 2021 | 562 |
| 125 | St Lawrence-East Bayfront-The Islands | 2021 | 527 |
| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Count |
|---|---|---|---|
| 78 | Lambton Baby Point (114) | 2021 | 42 |
| 56 | Guildwood (140) | 2021 | 48 |
| 150 | Woodbine-Lumsden (60) | 2021 | 51 |
| 113 | Princess-Rosethorn (10) | 2021 | 54 |
| 88 | Markland Wood (12) | 2021 | 58 |
Crime Statistics for 2022
| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Count |
|---|---|---|---|
| 139 | West Humber-Clairville (1) | 2022 | 1146 |
| 93 | Moss Park (73) | 2022 | 702 |
| 156 | York University Heights (27) | 2022 | 694 |
| 36 | Downtown Yonge East (168) | 2022 | 690 |
| 152 | Yonge-Bay Corridor (170) | 2022 | 604 |

| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Count |
|---|---|---|---|
| 56 | Guildwood (140) | 2022 | 53 |
| 9 | Bayview Woods-Steeles (49) | 2022 | 59 |
| 150 | Woodbine-Lumsden (60) | 2022 | 67 |
| 4 | Avondale (153) | 2022 | 70 |
| 64 | Humber Heights-Westmount (8) | 2022 | 73 |
Crime Statistics for 2023
| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Count |
|---|---|---|---|
| 139 | West Humber-Clairville (1) | 2023 | 1371 |
| 156 | York University Heights (27) | 2023 | 847 |
| 36 | Downtown Yonge East (168) | 2023 | 790 |
| 93 | Moss Park (73) | 2023 | 770 |
| 152 | Yonge-Bay Corridor (170) | 2023 | 717 |

| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Count |
|---|---|---|---|
| 150 | Woodbine-Lumsden (60) | 2023 | 58 |
| 78 | Lambton Baby Point (114) | 2023 | 68 |
| 56 | Guildwood (140) | 2023 | 83 |
| 64 | Humber Heights-Westmount (8) | 2023 | 83 |
| 88 | Markland Wood (12) | 2023 | 96 |
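The paired tables above list the five neighbourhoods with the highest and the lowest yearly totals. The notebook's own code for this step is not shown here, so the following is only a minimal sketch of one way to get such rankings with pandas `nlargest` and `nsmallest`. The frame name `yearly_2023` is an assumption, and it holds only the ten 2023 rows shown above rather than all neighbourhoods.

```python
import pandas as pd

# Yearly totals copied from the 2023 tables above (ten neighbourhoods only).
# The frame name `yearly_2023` is hypothetical.
yearly_2023 = pd.DataFrame(
    {
        "NEIGHBOURHOOD_158": [
            "Moss Park (73)",
            "Guildwood (140)",
            "West Humber-Clairville (1)",
            "Markland Wood (12)",
            "Yonge-Bay Corridor (170)",
            "Woodbine-Lumsden (60)",
            "Downtown Yonge East (168)",
            "Humber Heights-Westmount (8)",
            "York University Heights (27)",
            "Lambton Baby Point (114)",
        ],
        "OCC_YEAR": [2023] * 10,
        "Total_Count": [770, 83, 1371, 96, 717, 58, 790, 83, 847, 68],
    }
)

# Five highest totals, sorted in descending order.
top5 = yearly_2023.nlargest(5, "Total_Count")

# Five lowest totals, sorted in ascending order.
bottom5 = yearly_2023.nsmallest(5, "Total_Count")

print(top5)
print(bottom5)
```

Both methods keep the original row labels, which is consistent with the non-sequential index values (139, 93, 36, ...) seen in the tables.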
Comparison of Actual Data and the Predicted Data of 2024
<ipython-input-163-525f6f4de7e7>:6: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  predicted_crime_2024.loc[: , 'Predicted_Total_Count'] = y_voting_TC_pred
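The `SettingWithCopyWarning` above usually means the frame being assigned to was created by filtering another DataFrame. A common way to avoid it is to take an explicit `.copy()` of the filtered rows before adding the prediction column. The sketch below assumes that `predicted_crime_2024` was built by filtering on `OCC_YEAR`, which the output does not confirm. The source frame name `crime` and the short `y_voting_TC_pred` list are stand-ins, with values taken from tables in this notebook.

```python
import pandas as pd

# Stand-in for the full dataset; `crime` is a hypothetical name and the
# rows are a small sample of values shown in this notebook's tables.
crime = pd.DataFrame(
    {
        "NEIGHBOURHOOD_158": ["West Humber-Clairville (1)"] * 4,
        "HOOD_158": [1, 1, 1, 1],
        "OCC_YEAR": [2021, 2021, 2024, 2024],
        "OCC_MONTH": [1, 2, 1, 2],
        "Total_Count": [62, 43, 110, 101],
    }
)

# Stand-in for the voting ensemble's output for the 2024 rows.
y_voting_TC_pred = [113.0, 108.0]

# Taking an explicit copy makes the new frame independent of `crime`,
# so the column assignment below no longer triggers the warning.
predicted_crime_2024 = crime[crime["OCC_YEAR"] == 2024].copy()
predicted_crime_2024["Predicted_Total_Count"] = y_voting_TC_pred

print(predicted_crime_2024)
```

The original `crime` frame is left unchanged, so the new column exists only on the copy.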
| | NEIGHBOURHOOD_158 | HOOD_158 | OCC_YEAR | OCC_MONTH | Total_Count | Predicted_Total_Count |
|---|---|---|---|---|---|---|
| 36 | West Humber-Clairville (1) | 1 | 2024 | 1 | 110 | 113.0 |
| 37 | West Humber-Clairville (1) | 1 | 2024 | 2 | 101 | 108.0 |
| 38 | West Humber-Clairville (1) | 1 | 2024 | 3 | 79 | 110.0 |
| 39 | West Humber-Clairville (1) | 1 | 2024 | 4 | 93 | 110.0 |
| 40 | West Humber-Clairville (1) | 1 | 2024 | 5 | 104 | 110.0 |
| ... | ... | ... | ... | ... | ... | ... |
| 6630 | South Eglinton-Davisville (174) | 174 | 2024 | 2 | 15 | 11.0 |
| 6631 | South Eglinton-Davisville (174) | 174 | 2024 | 3 | 7 | 14.0 |
| 6632 | South Eglinton-Davisville (174) | 174 | 2024 | 4 | 17 | 13.0 |
| 6633 | South Eglinton-Davisville (174) | 174 | 2024 | 5 | 12 | 13.0 |
| 6634 | South Eglinton-Davisville (174) | 174 | 2024 | 6 | 13 | 12.0 |
948 rows × 6 columns
| | OCC_MONTH | Total_Count | Predicted_Total_Count |
|---|---|---|---|
| 0 | 1 | 3518 | 3189.0 |
| 1 | 2 | 3185 | 3008.0 |
| 2 | 3 | 3207 | 3340.0 |
| 3 | 4 | 3201 | 3406.0 |
| 4 | 5 | 3471 | 3732.0 |
| 5 | 6 | 3101 | 3676.0 |
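The city-wide monthly totals above can be reduced to a single error figure to make the comparison easier to read. The notebook does not report these metrics for the aggregated series, so this is only an illustrative sketch that computes the mean absolute error (MAE) and mean absolute percentage error (MAPE) in plain Python from the six rows shown.

```python
# Actual and predicted city-wide totals for January to June 2024,
# copied from the monthly comparison above.
actual = [3518, 3185, 3207, 3201, 3471, 3101]
predicted = [3189.0, 3008.0, 3340.0, 3406.0, 3732.0, 3676.0]

# Absolute error for each month.
abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]

# Mean absolute error, in incidents per month.
mae = sum(abs_errors) / len(abs_errors)

# Mean absolute percentage error, relative to the actual counts.
mape = 100 * sum(e / a for e, a in zip(abs_errors, actual)) / len(actual)

print(f"MAE:  {mae:.1f} incidents per month")
print(f"MAPE: {mape:.1f}%")
```

With only six months available, these figures are a rough indication of fit rather than a robust evaluation.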
Visualizations based on the Predictions
array([1, 2, 3, 4, 5, 6])
| | NEIGHBOURHOOD_158 | HOOD_158 | OCC_YEAR | OCC_MONTH |
|---|---|---|---|---|
| 0 | West Humber-Clairville (1) | 1 | 2024 | 7 |
| 1 | West Humber-Clairville (1) | 1 | 2024 | 8 |
| 2 | West Humber-Clairville (1) | 1 | 2024 | 9 |
| 3 | West Humber-Clairville (1) | 1 | 2024 | 10 |
| 4 | West Humber-Clairville (1) | 1 | 2024 | 11 |
| ... | ... | ... | ... | ... |
| 6630 | South Eglinton-Davisville (174) | 174 | 2027 | 8 |
| 6631 | South Eglinton-Davisville (174) | 174 | 2027 | 9 |
| 6632 | South Eglinton-Davisville (174) | 174 | 2027 | 10 |
| 6633 | South Eglinton-Davisville (174) | 174 | 2027 | 11 |
| 6634 | South Eglinton-Davisville (174) | 174 | 2027 | 12 |
6635 rows × 4 columns
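The feature table above pairs every neighbourhood with each future month, from July 2024 through December 2027. How the notebook builds this frame is not shown, so the following is a hedged sketch of one way to generate such a grid. The names `neighbourhoods`, `periods`, and `future` are assumptions, and only two neighbourhoods from the table are used to keep the example short.

```python
import pandas as pd

# Two neighbourhoods taken from the table above; a full run would list all of them.
# Maps NEIGHBOURHOOD_158 to HOOD_158.
neighbourhoods = {
    "West Humber-Clairville (1)": 1,
    "South Eglinton-Davisville (174)": 174,
}

# Future (year, month) pairs: the rest of 2024, then three full years.
periods = [(2024, month) for month in range(7, 13)]
periods += [(year, month) for year in (2025, 2026, 2027) for month in range(1, 13)]

# One row per neighbourhood and period, in the same column order as the table.
rows = [
    {
        "NEIGHBOURHOOD_158": name,
        "HOOD_158": hood,
        "OCC_YEAR": year,
        "OCC_MONTH": month,
    }
    for name, hood in neighbourhoods.items()
    for year, month in periods
]
future = pd.DataFrame(rows)

print(future.head())
print(future.tail())
```

A frame shaped like this can be passed to a fitted model's `predict` method, provided its columns are encoded the same way as the training features.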
| | NEIGHBOURHOOD_158 | HOOD_158 | OCC_YEAR | OCC_MONTH | Total_Counts |
|---|---|---|---|---|---|
| 0 | West Humber-Clairville (1) | 1 | 2024 | 7 | 111 |
| 1 | West Humber-Clairville (1) | 1 | 2024 | 8 | 111 |
| 2 | West Humber-Clairville (1) | 1 | 2024 | 9 | 108 |
| 3 | West Humber-Clairville (1) | 1 | 2024 | 10 | 110 |
| 4 | West Humber-Clairville (1) | 1 | 2024 | 11 | 107 |
| ... | ... | ... | ... | ... | ... |
| 6630 | South Eglinton-Davisville (174) | 174 | 2027 | 8 | 12 |
| 6631 | South Eglinton-Davisville (174) | 174 | 2027 | 9 | 13 |
| 6632 | South Eglinton-Davisville (174) | 174 | 2027 | 10 | 14 |
| 6633 | South Eglinton-Davisville (174) | 174 | 2027 | 11 | 14 |
| 6634 | South Eglinton-Davisville (174) | 174 | 2027 | 12 | 15 |
6635 rows × 5 columns
Anticipated Crime Statistics for next six months of 2024
Anticipated Total count of Crime Activities for upcoming three years
| | OCC_YEAR | Total_Counts |
|---|---|---|
| 0 | 2025 | 42175 |
| 1 | 2026 | 42175 |
| 2 | 2027 | 42167 |
Anticipated Crime Statistics for Upcoming Years with a Month Breakdown
| | OCC_YEAR | OCC_MONTH | Total_Counts |
|---|---|---|---|
| 0 | 2025 | 1 | 3189 |
| 1 | 2025 | 2 | 3008 |
| 2 | 2025 | 3 | 3340 |
| 3 | 2025 | 4 | 3406 |
| 4 | 2025 | 5 | 3732 |
| 5 | 2025 | 6 | 3676 |
| 6 | 2025 | 7 | 3659 |
| 7 | 2025 | 8 | 3711 |
| 8 | 2025 | 9 | 3575 |
| 9 | 2025 | 10 | 3632 |
| 10 | 2025 | 11 | 3685 |
| 11 | 2025 | 12 | 3562 |
| 12 | 2026 | 1 | 3189 |
| 13 | 2026 | 2 | 3008 |
| 14 | 2026 | 3 | 3340 |
| 15 | 2026 | 4 | 3406 |
| 16 | 2026 | 5 | 3732 |
| 17 | 2026 | 6 | 3676 |
| 18 | 2026 | 7 | 3659 |
| 19 | 2026 | 8 | 3711 |
| 20 | 2026 | 9 | 3575 |
| 21 | 2026 | 10 | 3632 |
| 22 | 2026 | 11 | 3685 |
| 23 | 2026 | 12 | 3562 |
| 24 | 2027 | 1 | 3189 |
| 25 | 2027 | 2 | 3008 |
| 26 | 2027 | 3 | 3332 |
| 27 | 2027 | 4 | 3406 |
| 28 | 2027 | 5 | 3732 |
| 29 | 2027 | 6 | 3676 |
| 30 | 2027 | 7 | 3659 |
| 31 | 2027 | 8 | 3711 |
| 32 | 2027 | 9 | 3575 |
| 33 | 2027 | 10 | 3632 |
| 34 | 2027 | 11 | 3685 |
| 35 | 2027 | 12 | 3562 |
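The monthly breakdown above can be rolled up by year as a consistency check against the yearly totals reported earlier in this section. The sketch below rebuilds the monthly table from the values shown and sums it with `groupby`. The frame names `monthly` and `yearly` are assumptions, not names from the notebook.

```python
import pandas as pd

# Monthly predictions copied from the table above. The 2025 and 2026
# values are identical; 2027 differs only in March (3332 instead of 3340).
monthly_2025 = [3189, 3008, 3340, 3406, 3732, 3676,
                3659, 3711, 3575, 3632, 3685, 3562]
monthly_2026 = list(monthly_2025)
monthly_2027 = list(monthly_2025)
monthly_2027[2] = 3332

monthly = pd.DataFrame(
    {
        "OCC_YEAR": [2025] * 12 + [2026] * 12 + [2027] * 12,
        "OCC_MONTH": list(range(1, 13)) * 3,
        "Total_Counts": monthly_2025 + monthly_2026 + monthly_2027,
    }
)

# Sum the monthly predictions within each year.
yearly = monthly.groupby("OCC_YEAR", as_index=False)["Total_Counts"].sum()

print(yearly)
```

The resulting sums are 42175, 42175, and 42167, matching the yearly table.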
Anticipated Crime Statistics for 2025
| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Counts |
|---|---|---|---|
| 139 | West Humber-Clairville (1) | 2025 | 1322 |
| 156 | York University Heights (27) | 2025 | 789 |
| 36 | Downtown Yonge East (168) | 2025 | 760 |
| 93 | Moss Park (73) | 2025 | 747 |
| 152 | Yonge-Bay Corridor (170) | 2025 | 679 |

| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Counts |
|---|---|---|---|
| 150 | Woodbine-Lumsden (60) | 2025 | 93 |
| 78 | Lambton Baby Point (114) | 2025 | 95 |
| 64 | Humber Heights-Westmount (8) | 2025 | 113 |
| 107 | Old East York (58) | 2025 | 113 |
| 19 | Broadview North (57) | 2025 | 115 |
Anticipated Crime Statistics for 2026
| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Counts |
|---|---|---|---|
| 139 | West Humber-Clairville (1) | 2026 | 1322 |
| 156 | York University Heights (27) | 2026 | 789 |
| 36 | Downtown Yonge East (168) | 2026 | 760 |
| 93 | Moss Park (73) | 2026 | 747 |
| 152 | Yonge-Bay Corridor (170) | 2026 | 679 |

| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Counts |
|---|---|---|---|
| 150 | Woodbine-Lumsden (60) | 2026 | 93 |
| 78 | Lambton Baby Point (114) | 2026 | 95 |
| 64 | Humber Heights-Westmount (8) | 2026 | 113 |
| 107 | Old East York (58) | 2026 | 113 |
| 19 | Broadview North (57) | 2026 | 115 |
Anticipated Crime Statistics for 2027
| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Counts |
|---|---|---|---|
| 139 | West Humber-Clairville (1) | 2027 | 1322 |
| 156 | York University Heights (27) | 2027 | 789 |
| 36 | Downtown Yonge East (168) | 2027 | 760 |
| 93 | Moss Park (73) | 2027 | 747 |
| 152 | Yonge-Bay Corridor (170) | 2027 | 679 |

| | NEIGHBOURHOOD_158 | OCC_YEAR | Total_Counts |
|---|---|---|---|
| 150 | Woodbine-Lumsden (60) | 2027 | 85 |
| 78 | Lambton Baby Point (114) | 2027 | 95 |
| 64 | Humber Heights-Westmount (8) | 2027 | 113 |
| 107 | Old East York (58) | 2027 | 113 |
| 19 | Broadview North (57) | 2027 | 115 |